Improved Katz Smoothi Modeling in Speech
نویسندگان
چکیده
In this paper, a new method is proposed to improve the canonical Katz back-off smoothing technique in language modeling. The process of Katz smoothing is detailedly analyzed and the global discounting parameters are selected for discounting. Further more, a modified version of the formula for discounting parameters is proposed, in which the discounting parameters are determined by not only the occurring counts of the n-gram units but also the low-order history frequencies. This modification makes the smoothing more reasonable for those n-gram units that have homophonic (same in pronunciation) histories. The new method is tested on a Chinese Pinyin-to-character (where Pinyin is the pronunciation string) conversion system and the results show that the improved method can achieve a surprising reduction both in perplexity and Chinese character error rate.
منابع مشابه
Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
متن کاملImproved katz smoothing for language modeling in speech recogniton
In this paper, a new method is proposed to improve the canonical Katz back-off smoothing technique in language modeling. The process of Katz smoothing is detailedly analyzed and the global discounting parameters are selected for discounting. Further more, a modified version of the formula for discounting parameters is proposed, in which the discounting parameters are determined by not only the ...
متن کاملشبکه عصبی پیچشی با پنجرههای قابل تطبیق برای بازشناسی گفتار
Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
متن کاملComparison of the Formalin- Ether and the Katz Kato in The Parasitological Diagnosis of Human Fascioliasis
ABSTRACT Human fascioliasis is an endemic disease in the Guillan province and it has became a public health problem in the port of Anzali. Thousands of people in this area were infected during the two outbreaks in 1987 and 1998. If the mature worm settles in the liver, finding the ova in the stool is the most reliable diagnosis of the infection. Therefore selecting the most sensitive parasito...
متن کاملOn enhancing katz-smoothing based back-off language model
Though the statistical language modeling plays an important role in speech recognition, there are still many problems that are difficult to be solved such as the sparseness of training data. Generally, two kinds of smoothing approaches, namely the back-off model and the interpolated model, have been proposed to solve the problem of the impreciseness of language models caused by the sparseness o...
متن کامل